Automatic Speaker Recognition Using Mel-Frequency Cepstral Coefficients Through Machine Learning
Authors
Abstract
Automatic speaker recognition (ASR) systems belong to the field of human-machine interaction, and scientists have been using feature extraction and feature matching methods to analyze and synthesize voice signals. One of the most commonly used methods for feature extraction is Mel Frequency Cepstral Coefficients (MFCCs). Recent research shows that MFCCs are successful in processing voice signals with high accuracy. MFCCs represent a sequence of voice signal-specific features. This experimental analysis is proposed to distinguish Turkish speakers by extracting MFCCs from speech recordings. Since human perception of sound is not linear, after the filterbank step of the MFCC method we converted the obtained log filterbanks into decibel (dB) features-based spectrograms without applying the Discrete Cosine Transform (DCT). A new dataset was created by converting each spectrogram into a 2-D array. Several machine learning algorithms were implemented with a 10-fold cross-validation method to detect the speaker. The highest accuracy, 90.2%, was achieved using a Multi-layer Perceptron (MLP) with the tanh activation function. An important output of this study is the inclusion of [...] as a [...] set.
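The feature step described in the abstract (mel filterbank energies converted to decibels, with the DCT skipped) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 16 kHz sample rate, the 25 ms frame and 10 ms hop, the 512-point FFT, and the 40 filters are all assumptions chosen for the example, since the abstract does not state them.

```python
import numpy as np


def hz_to_mel(hz):
    """Convert frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale.

    Returns an array of shape (n_filters, n_fft // 2 + 1).
    """
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def mel_db_spectrogram(signal, sample_rate=16000, frame_len=400, hop=160,
                       n_fft=512, n_filters=40):
    """Mel filterbank energies in dB, one row per frame, with no DCT applied.

    All default parameter values are assumptions made for this sketch.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum of each windowed frame (zero-padded to n_fft).
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    # Floor the energies so the logarithm is always defined.
    return 10.0 * np.log10(np.maximum(energies, 1e-10))


# Usage: one second of a synthetic 440 Hz tone stands in for a recording.
t = np.arange(16000) / 16000.0
tone = np.sin(2.0 * np.pi * 440.0 * t)
spectrogram = mel_db_spectrogram(tone)   # 2-D array, shape (98, 40)
feature_vector = spectrogram.reshape(-1)  # flattened row for a classifier
```

Each recording then yields one 2-D array that can be flattened into a row of a dataset. A classifier of the kind the abstract names could be evaluated on such rows with, for example, scikit-learn's `MLPClassifier(activation="tanh")` and `cross_val_score(..., cv=10)`; the network size and other hyperparameters are not given in the abstract and would have to be chosen.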
Similar resources
Mel Frequency Cepstral Coefficients for Speaker Recognition Using Gaussian Mixture Model-Artificial Neural Network Model
Speaker Recognition (SP) is a topic of great significance in the areas of intelligence and security. Biometric SP uses an automated method of verifying or recognizing the identity of a person on the basis of some characteristic, such as a fingerprint, a face pattern, or the human voice. Many methods proposed in the literature focus on front-end processing such as PLP and LPC. In this pa...
The Capacity of Mel Frequency Cepstral Coefficients for Speech Recognition
Speech recognition makes an important contribution to promoting new technologies in human-computer interaction. Today, there is a growing need to employ speech technology in daily life and business activities. However, speech recognition is a challenging task that requires different stages before obtaining the desired output. Among automatic speech recognition (ASR) components is the feature ex...
Mel, linear, and antimel frequency cepstral coefficients in broad phonetic regions for telephone speaker recognition
We’ve examined the speaker discriminative power of mel-, antimel- and linear-frequency cepstral coefficients (MFCCs, aMFCCs and LFCCs) in the nasal, vowel, and non-nasal consonant speech regions. Our inspiration came from the work of Lu and Dang in 2007, who showed that filterbank energies at some frequencies mainly outside the telephone bandwidth possess more speaker discriminative power due to ...
Generalized mel frequency cepstral coefficients for large-vocabulary speaker-independent continuous-speech recognition
The focus of a continuous speech recognition process is to match an input signal with a set of words or sentences according to some optimality criteria. The first step of this process is parameterization, whose major task is data reduction by converting the input signal into parameters while preserving virtually all of the speech signal information dealing with the text message. This contributi...
Mel Frequency Cepstral Coefficients for Music Modeling
We examine in some detail Mel Frequency Cepstral Coefficients (MFCCs), the dominant features used for speech recognition, and investigate their applicability to modeling music. In particular, we examine two of the main assumptions of the process of forming MFCCs: the use of the Mel frequency scale to model the spectra; and the use of the Discrete Cosine Transform (DCT) to decorrelate the Mel-spec...
Journal
Journal title: Computers, Materials & Continua
Year: 2022
ISSN: 1546-2218, 1546-2226
DOI: https://doi.org/10.32604/cmc.2022.023278